ORNL Cluster

ORNL Cluster Help

by Jon Tischler

Connecting to remote server
The main dialog.
Starting a job.
Setting the reconstruction parameters.
Set files for 2D data
Setting the peak search parameters.
Setting the indexing parameters.
View the queue on server
Starting and stopping the remote server.

Connect to the Server

The first thing that you will need to do is to connect to the server, otherwise you are just talking to yourself. So go to the 'Server' menu and select one of the listed servers. You will probably want hpcs34, but if you are running the server on your own computer, then you can choose localhost.

Note, you can quit this program at any time and restart & reconnect to the server without affecting any jobs executing on the server in any way.

What is the server doing now?

At the top of the main dialog, the status of the server will be shown. Red means no connection, green means the server is idle, and blue means that the server is busy.

If the server is running (blue), then you can see what is happening by selecting "Server → View Queue on Server". This will put up a window showing what is happening. If nothing is running, then this menu item is disabled.


Setting Up a Job

To actually do some computations you have to provide all sorts of information to let the server know what to do and how to do it. There are various ways to provide this information.

  • The simplest way to set all input parameters is to open a local xml file (on your computer) that has all of the input parameters already in it. To do this go to the menu "File → Open", and select the desired xml file. Later on I will explain how to create such a file.
  • Another easy way to get the required input information is to ask the server. If you go to the menu item "Server → Get All Values from Server", or push the "Get←Server" button, all of the input parameters currently set in the server will be downloaded into this client program.
  • The normal way to provide the required information is by fillng out all of the dialogs. Since this is more time consuming, I suggest first using one of the two previous methods (file or download) to preset everything to reasonable values. Then proceed through all of the dialogs until all parameters are set.
  • When all values are entered and set to your satisfaction and Validated, you can save these settings to an xml file by the menu item "File" → "Save As...". This saved file can then be read in at a later date to either re-do the same calculation or one only slightly different.

  • Main Dialog


    2D/3D button
    3D means there is a wire scan and so a reconstruction should be done. If no reconstruction (no wire scan), then choose 2D.
    Wait for files
    This causes the analysis to analyze the data as it is taken. This is useful when you start the analysis before the measurement is finished. If the data have already been taken, do not check this box. For example, do not check the box if you are re-analyzing old data.
    Index Images
    Causes the images (either reconstructed or not) to be indexed. This needs to be checked if you want to know the orientation of a Laue pattern. If you do not have Laue pictures that can be oriented, then do not check. Usually you will check this box for white images. For monochormatic (or energy scans) you should probably not check this box.
    Skip Reconstruction
    This is only useful for 3D data which have already been reconstructed, and you want to re-do the peaksearch and indexing. For 2D data there is no reconstruction, and for 3D data if you have not yet done the reconstruction then you cannot skip it. But if you have already done the reconstruction and want to re-index (or search for peaks) with different parametrers, then use this checkbox.
    path to binaries
    This is the path to the progam files used by the cluster. So the path is relative to the cluster nodes. The default is ~/bin/. It is likely that this path cannot be even seen by your computer (but the cluster can).
    path to data files
    This is a prefix path that will be prepended to all data files, both input files and output files. It is not required, but it will make things easier for you.
    geometry file:
    This is the path to the geometry file (including the name). The full path to the geometry file from the server's point of view is "path to data files"/"geometry file".

    buttons:

    Help
    Brings up a view of this help.
    Go
    Start analysis running on the server using the supplied parameters. If buttong is disabled (grayed out), then you need to first validate the parameters (see Validate All below).
    Kill Jobs
    Kills all currenly executing and pending jobs on the server. Use this when you just want to stop everyhthing. Using this will often leave some of the intermediate text files lyng around in raw or reconstructed image folders. There is no way to automatically delete these files after pushing "Kill Jobs"
    Quit
    Quits this program.
    Get←Server
    Get all of the input parameters from the server. This sets up everything just as it is in the server, very useful when you re-connect to the server and want to set everything just as it was when you last quit. This is equivalent to the menu item "Server → Get All Values from Server".
    Validate All
    Send ALL values from this program (not just the ones on this panel) to the server and has the server check them for validity. This includes checking that all required paths and (most) of the files exist at the server. Use this just before the 'Go' button. You need to do this before pushing the Go button or doing a "Save As..." from the menu.
    View Queue
    Initially disabled, when the server is running jobs, this button will open the queue activity window. It is equivalent to the
    Server → View Queue on Server menu item.
    Top

    Other Dialogs:

    There are three other dialogs that need to be accessed before starting a calculation. They are accessed from the "Set" menu.
    Set → Reconstruct...
    Set → Set 2D Files...
    Set → Peak Search...
    Set → Indexing...
    Server → View Queue on Server
    Top

    Set → Reconstruct...


    This is used to set the parameters needed for doing a reconstruction (3D) data. For 2D data this panel is not needed.
    file root:
    path to the raw images. This is relative to the "path to data files" from the main panel. For multiple detectors, separate file roots by a comma, e.g. "raw/Si_Orange_37.h5, raw/Si_Yellow_37.h5, raw/Si_Purple_37.h5"
    first file number:
    An integer that designates the first image in the reconstruction.
    last file number:
    An integer that designates the last image in the reconstruction. If you want the computer to figure out where the reconstruction ends, use Inf, and the reconstruction will stop when the computer runs out of input images.
    output folder:
    Relative path to the folder that receives the reconstructed images. For multiple detectors, separate the folders e.g. "recon/, recon/, recon/" or "reconO/, reconY/, reconP/"; putting all files in the same folder is preferred.
    Depth Range, first:
    depth of the first reconstructed image.
    last:
    depth of the last reconstructed image.
    resolution:
    distance (in depth) between each reconstructed image.
    % of pixels to process:
    Most fo the pixels in each Laue picture contain no information, so reconstructing them provides no benefit. For lots of big images, reconstructing all of the pixels in each image will take a long time, this parameter will help speed the process by ignoring the unimportant pixels. The procedure used to decide which pixels are important is as follows: the first image in each wire scan is examined, and the most intense 10% (if this value is set to 10) of the pixels are the only ones that are reconstructed. This parameter is useful for big Laue images. For energy wire scans which usually use only a small region of the detector, use 100%.
    memory to use:
    Amount of memory (in MiB = Megabytes) to use for the reconstruction. Good values are 1024 (equivalend to 1 Gigabyte) for the cluster and 2048 on the Mac.
    Io normaliztion:
    Can be used to normalize the data to a PV that was stored in each image. Often left blank.
    Wire Edge
    Tells the reconstruction which edge(s) of the wire to use. We usually use Both Edges of Wire. This works best unless your sample has good crystalline grains larger than the wire diameter. Then you should use Leading Edge of Wire. The Trailing Edge is only for completeness, and is seldom used.
    output type
    Determines the number type that is used for saving the reconstructed images. Usually use default output type. However, it is possible to save disk space by choosing 2 byte signed int (short). This only works well when the reconstruction does not produce any really strong peaks. If you do not know what you are doing, use the default.
    detector to use
    You can specify the detector that is used. Generally this is a bad idea, the analysis programs should figure this out for themselves based on information in each image. So just choose -1 (auto). However if something is wrong, you can force a particular detector geometry to be used by specifying the detector here. Do not mess with this if you are not sure, just use auto.

    buttons:

    Done
    Dismisses this dialog, all values entered in this dialog will be saved for use. If this button is disabled, try the Validate button.
    Cancel
    Cancel this dialog, do NOT use the values entered here, they are discarded. This is equivalent to just closing the window.
    Validate
    Does a simple check on the entered values. This is only a local check, the server is not involved. You have to do a Validate to activate the Done button.
    Top

    Set → Set 2D Files...


    This is used to set the parameters needed for indexing many image files, probably a 2D scan.
    file root:
    path to the raw images. This is relative to the "path to data files" from the main panel.
    first file number:
    An integer that designates the first image in the reconstruction.
    last file number:
    An integer that designates the last image in the reconstruction. If you want the computer to figure out where the reconstruction ends, use Inf, and the reconstruction will stop when the computer runs out of input images.
    output xml file:
    This is the path to the output xml file (contains result of indexing). The full path of this file from the server's point of view will be "path to data files"/"xml file for output". For 3D data this file name is automatically generated, but for 2D data you need to supply it.
    detector to use
    You can specify the detector that is used. Generally this is a bad idea, the analysis programs should figure this out for themselves based on information in each image. So just choose -1 (auto). However if something is wrong, you can force a particular detector geometry to be used by specifying the detector here. Do not mess with this if you are not sure, just use auto.

    buttons:

    Done
    Dismisses this dialog, all values entered in this dialog will be saved for use. If this button is disabled, try the Validate button.
    Cancel
    Cancel this dialog, do NOT use the values entered here, they are discarded. This is equivalent to just closing the window.
    Validate
    Does a simple check on the entered values. This is only a local check, the server is not involved. You have to do a Validate to activate the Done button.
    Top

    Set → Peak Search...


    This controls the peak searching which is a preliminary to the indexing. It is not used for energy scans (e.g. an energy wire scan).
    box size
    Gives a size used for finding and fitting a peak. Typical values are 5 for sharp peaks, and 18 for larger peaks.
    max R-factor
    Reject peaks whose fit has an R-factor greater than this value. Usually use 0.5.
    threshold
    The threshold used for peak searching. Usually just use nan, and let the program choose one for itself.
    min spot size
    Reject spots with size smaller than this number. For larger spots use 1.1, for sharp peaks or for Gaussian shapes use smaller values such as 0.3.
    min spot distance
    Do not accept peaks that are within this distance from each other. This keeps fitted peaks from being to close together.
    max number of spots
    Used to limit the number of spots found and fitted. Use -1 to accept all. Hoever, big nubmers (>50) can sometimes cause problems with the indexing.
    Smooth before fitting, Check Box
    This does a smooth operation on the image before fitting, not usually needed. This only works when using peaksearchBox.
    Gaussian/Lorentzian
    Function to be used for fitting the peaks. The Gaussian definitely works better for sharp peaks, and is usually a better choice.

    buttons:

    Done
    Dismisses this dialog, all values entered in this dialog will be saved for use. If this button is disabled, try the Validate button.
    Cancel
    Cancel this dialog, do NOT use the values entered here, they are discarded. This is equivalent to just closing the window.
    Validate
    Does a simple check on the entered values. This is only a local check, the server is not involved. You have to do a Validate to activate the Done button.

    programs:

    XY program:
    Name of program (relative to the "binary path") to be used for finding peaks on images. Usually use peaksearch, sometimes peaksearchBox works better, but rarely. If you are using a special peak search routine, here is where you specify it.
    piexl → Q program:
    Name of program (relative to the "binary path") that takes ouput from XY program and the geometry file and produces the list of Q hats used by indexing program. Usually use pixels2qs.
    combine program:
    Name of program (relative to the "binary path") that takes Qhat ouput from multiple detectors and combines them into one file that is used by the indexing routine. For single detector data this program is not neede. Usually use combinePeaks.
    Top

    Set → Index...


    This dialog provides the parameters used by the indexing routine. The indexing routine uses the measured Qhats from the peak searching step to determine the orientation of the crystallite(s). The indexing routine will give a list of orientations if multiple Laue patterns are present. The output list should be ordered so that the best patern is first.
    angle tolerance
    The indexing program compares all pairs of measured Qhats to all pairs of possible Qhats. But only considers those pairs which have the same angular separation to within this tolerance. It also uses this angle in the final allocation of measured Qhats to a derived orientation.
    cone angle
    central hkl
    max calc (keV)
    These three parameters determine the set of possible hkl that are used when indexing a pattern. The hkl that are considered when indexing are all hkl that are within cone angle of the central hkl, and whose energy will be below max calc. Note, max calc only limits the hkl that are used to find an orientation, the indexing process contains a second pass where spots with higher energies may be included in a pattern.
    max test (keV)
    Energy used when determining which measured spots belong to an orientation. Once an orientation has been determined (using cone angle, central hkl, and max calc) another list of hkl's is generated this time accepting energies out to max test. It is this new list that is used to identify which measured spots belong to this orientation. Note, this process is done for each orientation found in a Laue pattern.
    max no. of spots
    Limits the number of spots used to index a pattern. The indexing program will only use the first N spots from the peak search program.
    .xtl file
    File with extension .xtl containing the definition of the crystal that we are trying to index. The actual file is relative to the "path to data files" in the main dialog.

    buttons:

    Done
    Dismisses this dialog, all values entered in this dialog will be saved for use. If this button is disabled, try the Validate button.
    Cancel
    Cancel this dialog, do NOT use the values entered here, they are discarded. This is equivalent to just closing the window.

    programs:

    indexing program:
    Name of program (relative to "path to binaries") that does the indexing. Usually euler. Unless you are using some special program (which does not yet exist), use euler.
    Top

    Server → View Queue on Server


    This menu item is only available after a calculation has been started and is running; it has no input parametes. It provides information about how many jobs are currently running on nodes in the cluster, and about how many jobs are queued up for the cluster. It also gives information about the progress of the current calculation.
    running reconstructions:
    The number of reconstructions that are currenly executing on cluster nodes.
    running indexing tasks:
    The number of peaksearch/indexing tasks that are currenly executing on cluster nodes.
    pending reconstructions:
    The number of reconstructions that are waiting in the queue for a free compute node.
    pending indexing tasks:
    The number of peaksearch/indexing tasks that are waiting in the queue for a free compute node.
    other running tasks:
    The number of tasks running on the cluster that were NOT started by this application. These tasks were put on the cluster by something/someone else that we don't know about. In general this should be zero.
    other pending tasks:
    The number of tasks that are waiting in the queue for a free compute node that were NOT started by this application. These tasks were put on the cluster queue by something/someone else that we don't know about. In general this should be zero.
    A progress bar  
    Shows progress for this submitted calculation. When the blue reaches the right side it should finish.
    percent completed:
    Fraction of this calculation that has been completed. This is also the fractional length of the blue in the progress bar.
    elapsed time:
    Amount of time since this calculation was started.
    remaining time:
    Predicted amount of time needed for this calculation to finish.
    ends at:
    Time of day when this calculation is predicted to finish.
    Top

    Starting (and Stopping) the Remote Server

    This is typically an Administrator function, generally, you should not need to know about this.

    These are the commands typically used to start (and stop) the remote server.

    First open a terminal window and log into the machine that you want to run the server on (usually hpcs34.xor.aps.anl.gov), and navigate to the folder containing the server python scripts.
    Then it is usualy best (but not required) to start the server in a detachable window, so type the command:

    $screen

    This starts a new screen session which can be detached. Then actually start the server by executing the Python script, type:

    $./server.py

    The server should now be running. The terminal should tell you that output from the server is being redirected to a file named serverLog.txt in the current folder. You can look at this log file at any time from any terminal. You can now detach from this screen (and log out) without stopping the server To detach from this screen, type in:

    $[Ctrl+A] [D]

    that is just two characters, first the "Ctrl+A", and then a "D". Now the server is running in a detached screen and you can log out or do whatever you want without stopping the server.

    To re-attach to the server process (probably to kill it), log into the same machine that you started it on and type:

    $screen -r

    This will re-attach you to your previously detached screen and allow you to do things such as killing the server. To kill the server, just type one character:
    Ctrl+C

    If you want to run the server without a separate screen session, simply log in and type:
    $./server.py
    This will work fine. However, if you log out or kill the terminal window, the server will be killed, which is probably not what you want. By using the screen command you can detach and log off the computer while the server continues to run, which is what you generally want for normal operations. It should also be possible to run the server with a nohup command, but I have not yet tried that, and don't see any advantage to using it.
    Top